This article examines how diseases on random networks spread in time. The disease is described by a probability distribution function for the number of infected and recovered individuals, and the probability distribution is described by a generating function. The time development of the disease is obtained by iterating the generating function. In cases where the disease can expand to an epidemic, the probability distribution function is the sum of two parts: one which is static at long times, and another whose mean grows exponentially. The time development of the mean number of infected individuals is obtained analytically. When epidemics occur, the probability distributions are very broad, and the uncertainty in the number of infected individuals at any given time is typically larger than the mean number of infected individuals.



Introduction. A series of papers by Strogatz, Watts [1], Vespignani, Meyers, Newman, Stanley [12], Barabasi [13] and collaborators applies methods from graph theory and percolation theory to the spread of disease on random networks. These papers mainly study the final state of a population once the disease has run its course, with all individuals either susceptible but uninfected, or recovered. Here I show how to apply the same analytical techniques to the dynamics of the epidemic and find how the number of infected individuals varies in time.

A starting point for this study was to clear up a technical point arising when an epidemic is possible, but not certain. Newman, Strogatz and Watts [9] find a probability distribution function $P_k$ that $k$ individuals have been infected, and they show that $u = \sum_k P_k < 1$. They determine $u$ from a self-consistent equation, and interpret this distribution function as describing the probability of a finite outbreak that does not grow to system size. The remaining probability is contained in an outbreak that fills the whole system. This interpretation is puzzling. Since $k$ can have any size, why does $P_k$ describe only finite outbreaks? How do the self-consistent equations determining $u$ figure out how to find only these finite outbreaks, and discard the larger ones? The authors assert that the system-size outbreaks would contain loops that invalidate the formalism they are employing, but how does the formalism know this? These questions are resolved when one examines the probability distribution after $n$ time steps, $P_k^{(n)}$. One finds that the probability distribution is the sum of two pieces. The first piece $Q_k^{(n)}$ converges to a time-independent function $Q_k$ in the long-time limit, with $\sum_k Q_k < 1$. The second piece $R_k^{(n)}$ never stops evolving. Its mean and width grow exponentially. So long as the mean of $R_k^{(n)}$ is much smaller than the total system size, it can be described by standard generating function techniques, and this description is not invalidated by the presence of loops. Thus, the generating function formalism has been finding $Q_k$, and the reason this function emerges is that $P_k^{(n)}$ converges to $Q_k$ pointwise, although at any given time step $n$ a finite fraction of $P_k^{(n)}$ is contained in a very broad tail of the distribution that has formed out in front of $Q_k$. Techniques essentially identical to those used previously to describe $Q_k$ can be used to analyze $R_k^{(n)}$. In particular, one can find closed-form expressions for the mean number of people infected at time $n$. When an epidemic is possible, both the mean and width of $R_k^{(n)}$ grow exponentially in time. In general, one's uncertainty about precisely how many people will be infected in the future grows as fast as or faster than the number of diseased individuals.

Dynamical Equations. Consider a random network in which the probability distribution of nodes with $k$ edges is $p_k$. Following Newman, Strogatz, and Watts [9], the generating function for the distribution of nodes is

$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k. \qquad (1)$$

Consider choosing a random edge in the system. The probability that the node reached by this edge will have $k$ new edges in addition to the one chosen to start with is generated by

$$G_1(x) = \frac{G_0'(x)}{G_0'(1)}. \qquad (2)$$

Consider conventional Susceptible-Infected-Recovered dynamics on this network [9]. At each time step, uninfected nodes connected by an edge to infected nodes become infected in turn. Let $P_k^{(n)}$ give the probability that a grand total of $k$ individuals has been infected after $n$ time steps, and let the generating function for $P_k^{(n)}$ be

$$H^{(n)}(x) = \sum_{k} P_k^{(n)} x^k. \qquad (3)$$
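The generating functions $G_0$ and $G_1$ used here can be sketched numerically by storing the degree distribution as a list of coefficients. This is a minimal illustration, not the author's code: the helper names (`poly_eval`, `derivative`, `make_G1`) and the example distribution `p` are assumptions chosen only for demonstration.

```python
def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule (x may be complex)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def derivative(coeffs):
    """Coefficients of the derivative of the polynomial given by coeffs."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def make_G1(p):
    """Coefficients of G_1(x) = G_0'(x) / G_0'(1) for degree distribution p."""
    d = derivative(p)
    mean_degree = sum(d)  # G_0'(1)
    return [c / mean_degree for c in d]


# Hypothetical degree distribution: p[k] is the probability of k edges.
p = [0.0, 0.5, 0.3, 0.2]
G1 = make_G1(p)
z1 = sum(derivative(p))

print("G0(1) =", poly_eval(p, 1.0))   # normalization of p_k
print("G1(1) =", poly_eval(G1, 1.0))  # G_1 generates a normalized distribution
print("z1    =", z1)                  # mean degree
```

Storing coefficients rather than closed-form functions keeps $G_1$ a one-line transformation of $G_0$, at the cost of being limited to distributions with a finite maximum degree.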
Imagine starting with a single infected individual. At step 0, one has $H^{(0)}(x) = x$. At the next time step, the generating function for the total number of individuals infected is

$$H^{(1)}(x) = x G_0(x), \qquad (4)$$

since $G_0(x)$ gives the probability that a given node has 0, 1, 2, ... edges, and one multiplies by $x$ because one began with one infected individual. Each of the edges departing the first one reaches some other node. The probability it will have $k$ additional edges leaving it is given by $G_1(x)$. Using the powers property in Section IIA of Ref. [9], one has

$$H^{(2)}(x) = x G_0(x G_1(x)). \qquad (5)$$

Continuing in this fashion, one has

$$H^{(n)}(x) = H^{(n-1)}(x G_1(x)), \qquad (6)$$

where the substitution $x \to x G_1(x)$ is applied to the innermost argument only. An equivalent formulation is to define instead

$$F^{(0)}(x) = 1, \qquad (7a)$$
$$F^{(n)}(x) = G_1\!\left(x F^{(n-1)}(x)\right), \qquad (7b)$$
$$H^{(n)}(x) = x G_0\!\left(x F^{(n-1)}(x)\right). \qquad (7c)$$
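The recursion for $F^{(n)}$ and $H^{(n)}$ can be sketched as a short loop that evaluates $H^{(n)}$ at a single point $x$. The function name `H_n` and the example degree distribution are illustrative assumptions; the check at the end only confirms that the loop reproduces $H^{(2)}(x) = x G_0(x G_1(x))$.

```python
def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule (x may be complex)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def make_G1(p):
    """Coefficients of G_1(x) = G_0'(x) / G_0'(1)."""
    d = [k * c for k, c in enumerate(p)][1:]
    z1 = sum(d)
    return [c / z1 for c in d]


def H_n(p, n, x):
    """Evaluate H^(n)(x) using F^(0) = 1, F^(m) = G1(x F^(m-1)),
    and H^(n) = x G0(x F^(n-1))."""
    if n == 0:
        return x
    G1 = make_G1(p)
    F = 1.0
    for _ in range(n - 1):
        F = poly_eval(G1, x * F)
    return x * poly_eval(p, x * F)


# Hypothetical degree distribution used only for this check.
p = [0.0, 0.5, 0.3, 0.2]
x = 0.8
direct_H2 = x * poly_eval(p, x * poly_eval(make_G1(p), x))
print(H_n(p, 2, x), direct_H2)
```

Because each step costs one polynomial evaluation, evaluating $H^{(n)}$ at many points on the unit circle stays cheap even for large $n$.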



To extract the probability distribution function from a generating function $H(z)$, note that from Cauchy's theorem

$$P_k = \frac{1}{2\pi i}\oint \frac{dz}{z^{k+1}}\, H(z) = \int_0^1 d\theta\, e^{-2\pi i k\theta}\, H(e^{2\pi i\theta}). \qquad (8)$$

Suppose now that $H$ has been evaluated around the unit circle at $M$ points, with $\theta_l = l/M$, $l \in [0, M-1]$, and let

$$H_l = H(e^{2\pi i\theta_l}). \qquad (9)$$

Then one has

$$P_k = \frac{1}{M}\sum_{l=0}^{M-1} e^{-2\pi i k l/M} H_l = \frac{1}{M}\,\mathrm{DFT}(H_l)[k], \qquad (10)$$

where the last expression means that one takes the $k$'th element of the inverse discrete Fast Fourier Transform. Using Eq. (7c) and employing Eq. (10) to obtain probabilities $P_k$, one easily obtains hundreds of iterates of the map, for hundreds of thousands of values of $k$.
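The extraction of probabilities from samples on the unit circle can be sketched with a direct discrete Fourier sum. This is a small-scale illustration using only the standard library, not the FFT-based computation described in the text. The function names are assumptions, and the sum is exact only when $M$ exceeds the largest cluster size that can occur after $n$ steps (106 for the distribution and $n = 3$ used below).

```python
import cmath


def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule (x may be complex)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def H_n(p, n, x):
    """Evaluate H^(n)(x) = x G0(x F^(n-1)(x)) with F^(m) = G1(x F^(m-1))."""
    d = [k * c for k, c in enumerate(p)][1:]
    z1 = sum(d)
    G1 = [c / z1 for c in d]
    F = 1.0
    for _ in range(n - 1):
        F = poly_eval(G1, x * F)
    return x * poly_eval(p, x * F)


def extract_probabilities(p, n, M):
    """Approximate P_k^(n), k = 0..M-1, from M samples of H^(n) on the unit circle."""
    samples = [H_n(p, n, cmath.exp(2j * cmath.pi * l / M)) for l in range(M)]
    probs = []
    for k in range(M):
        total = sum(h * cmath.exp(-2j * cmath.pi * k * l / M)
                    for l, h in enumerate(samples))
        probs.append((total / M).real)
    return probs


# Degree distribution of the first polynomial example in the text.
p_A = [0.0, 0.7, 0.2, 0.05, 0.04, 0.01]
P = extract_probabilities(p_A, n=3, M=256)
print("total probability:", sum(P))
print("mean cluster size:", sum(k * Pk for k, Pk in enumerate(P)))
```

The double loop costs $O(M^2)$ operations, so for the hundreds of thousands of values of $k$ mentioned above one would replace it with a fast Fourier transform. For instance, NumPy's forward `numpy.fft.fft` uses the same $e^{-2\pi i k l/M}$ sign convention and leaves the $1/M$ factor to the caller.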

Static and growing distributions. Some results of solving Eqs. (7) appear in Figure 1. Figure 1(A) shows distributions resulting from the polynomial $G_0(x) = 0.7x + 0.2x^2 + 0.05x^3 + 0.04x^4 + 0.01x^5$. The threshold for an epidemic is determined by $z_2 > z_1$, where $z_1 = G_0'(1)$ is the average number of neighbors of each node, and $z_2 = G_1'(1) z_1$ is the average number of second neighbors. In the present case, $z_1 = 1.46$ and $z_2 = 1.38$, so the infection is contained, and the probability distribution converges to a definite limit enclosing unit probability. The upper curve shows the cumulative sum $S_k = \sum_{k'=1}^{k} P_{k'}^{(100)}$. The mean number of people infected after 100 iterations is 27, but the distribution is broad; for example, there is a 1% chance that more than 480 people will be infected. Figure 1(B) shows results from the polynomial $G_0(x) = 0.7x + 0.1x^2 + 0.05x^3 + 0.01x^4 + 0.14x^5$, which gives $z_1 = 1.79$, $z_2 = 3.42$. Since $z_2 > z_1$, an epidemic is possible. One can compute the probability of an epidemic spiraling out of control following [9]; also see Eq. (12). There is a root of $G_1(u) = u$ at $u = 0.492$, and $G_0(u) = 0.3790$. This computation predicts a 37.9% chance that the disease will run its course without becoming an epidemic. The upper curve in Figure 1(B) shows the cumulative sum $S_k$, and there is a broad plateau where this sum has reached 0.38. The mean number of infected individuals after 11 iterations is 4650, but there is a 1% chance that more than 26000 people will be infected. Figure 1(C) uses the probability distribution $p_0 = 0$, $p_k \propto k^{-\alpha} e^{-k/\kappa}$ with $\alpha = 2$ and $\kappa = 20$. Now $z_1 = 1.8$, $z_2 = 5.3$, and the epidemic grows even more rapidly. There is a 41% chance that the epidemic will be contained. The mean number of infected individuals after 7 steps is 5500, but there is a 1% chance that more than 50,000 will be infected.

Figure 1: (A) Dynamical evolution of Eq. (7c) in a case where the average number of second neighbors $z_2$ is less than the average number of neighbors $z_1$ and there is no epidemic. The map is iterated 100 times. (B) Dynamical evolution of Eq. (7c) in a case where $z_2 > z_1$, so one expects the existence of a giant component. The map is iterated 12 times. (C) Similar to (B), but now using a broader probability distribution. The epidemic grows quickly and only 7 iterations are displayed. Horizontal axis: infected individuals $k$.
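The numbers quoted for the second polynomial example can be checked with a few lines of arithmetic. In this sketch the root of $G_1(u) = u$ is found by fixed-point iteration starting from $u = 0$. That solver is an assumption made here for simplicity, not a method stated in the text.

```python
def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def derivative(coeffs):
    """Coefficients of the derivative of the polynomial given by coeffs."""
    return [k * c for k, c in enumerate(coeffs)][1:]


# G0(x) = 0.7x + 0.1x^2 + 0.05x^3 + 0.01x^4 + 0.14x^5
p_B = [0.0, 0.7, 0.1, 0.05, 0.01, 0.14]

d1 = derivative(p_B)
d2 = derivative(d1)
z1 = sum(d1)                 # G0'(1): mean number of neighbors
z2 = sum(d2)                 # G0''(1) = G1'(1) * z1: mean number of second neighbors
G1 = [c / z1 for c in d1]

# Fixed-point iteration u <- G1(u), started below the root.
u = 0.0
for _ in range(200):
    u = poly_eval(G1, u)

prob_contained = poly_eval(p_B, u)   # G0(u)

print("z1 =", z1, " z2 =", z2, " epidemic possible:", z2 > z1)
print("u  =", u)
print("probability the outbreak stays finite =", prob_contained)
```

Starting the iteration at $u = 0$ matters: $u = 1$ is always a solution of $G_1(u) = u$, and the iteration from below converges to the smaller root, which is the one relevant to finite outbreaks.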

Thus when there is the possibility of an epidemic, the probability distribution does indeed split into two components. The first component $Q_k$ is static in the long-time limit and describes the probability that spread of disease terminates with a number of infected individuals much smaller than the total population. The second component $R_k^{(n)}$ continues to evolve forever. From a formal point of view, the definition of $Q_k$ is

$$Q_k = \lim_{n\to\infty}\int_0^1 d\theta\, e^{-2\pi i k\theta}\, H^{(n)}(e^{2\pi i\theta}). \qquad (11)$$

For any fixed $k$, this limit converges. Then $R_k^{(n)}$ can be defined as $R_k^{(n)} = P_k^{(n)} - Q_k$. One can similarly decompose the probability distribution resulting from $F^{(n)}$ into static and evolving components. To see now how the probability of not participating in the epidemic emerges from self-consistent equations, define $F^{\infty}(x) = \lim_{n\to\infty} F^{(n)}(x)$. This limit exists for any $x < 1$, since large powers of $x < 1$ in the power series for $F^{(n)}$ suppress the parts of $F^{(n)}$ that are continuing to evolve. Return to Eq. (7b), and write

$$\lim_{x\to 1}\lim_{n\to\infty}\left[F^{(n)}(x) - G_1\!\left(x F^{(n-1)}(x)\right)\right] = 0$$
$$\Rightarrow\ \lim_{x\to 1}\left[F^{\infty}(x) - G_1\!\left(x F^{\infty}(x)\right)\right] = 0$$
$$\Rightarrow\ u = G_1(u) \quad\text{with}\quad u = \lim_{x\to 1} F^{\infty}(x). \qquad (12)$$

Finally $G_0(u) = \lim_{x\to 1}\lim_{n\to\infty} H^{(n)}(x)$ gives the probability that the disease does not spiral into an epidemic.

Figure 2: Decomposition of the data in Figure 1(B) into static and growing components $Q_k$ and $R_k^{(n)}$. This is done by computing $P_k^{(11)}$, setting $Q_k = P_k^{(11)}$ for $k \le 32$, fitting $Q_k$ to a power law for $k > 32$, and subtracting $Q_k$ obtained this way from the distributions $P_k^{(n)}$.
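The order of limits in the self-consistency argument can be illustrated numerically: iterate $F^{(n)}(x)$ to large $n$ at fixed $x$ slightly below 1, then let $x$ approach 1. The sketch below uses the second polynomial example from the text. The function name `F_limit`, the iteration counts, and the choice of offsets `eps` are assumptions for illustration. The point $x = 1$ itself is deliberately not evaluated, since there $F^{(n)}(1) = 1$ exactly for every $n$ and rounding errors would be amplified by the iteration.

```python
def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def make_G1(p):
    """Coefficients of G_1(x) = G_0'(x) / G_0'(1)."""
    d = [k * c for k, c in enumerate(p)][1:]
    z1 = sum(d)
    return [c / z1 for c in d]


def F_limit(G1, x, iterations):
    """Iterate F <- G1(x F) from F = 1; approximates F^infinity(x) for x < 1."""
    F = 1.0
    for _ in range(iterations):
        F = poly_eval(G1, x * F)
    return F


p_B = [0.0, 0.7, 0.1, 0.05, 0.01, 0.14]
G1 = make_G1(p_B)

# Smaller root of u = G1(u), found by fixed-point iteration from u = 0.
u = 0.0
for _ in range(200):
    u = poly_eval(G1, u)

for eps in (1e-2, 1e-4, 1e-6):
    print("x = 1 -", eps, " F_inf(x) =", F_limit(G1, 1.0 - eps, 2000), " u =", u)
```

As `eps` shrinks, the iterated value approaches $u$ rather than 1, which is the sense in which taking $n \to \infty$ before $x \to 1$ selects the finite outbreaks.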

Figure 2 shows an explicit decomposition of the data in Figure 1(B) into components $Q$ and $R$. The task is carried out by taking the final curve in Figure 1(B) and noticing that it has converged to a static value up to around $k = 32$ (the precise cut point does not matter much) and is continuing to evolve for larger $k$. For $k > 32$, $Q_k$ is estimated by a power-law fit. The area under $Q_k$ found this way is 0.3791, which compares well with the predicted value of 0.3790.

Size of infected cluster. One can work out analytically the average size of the infected/recovered cluster as a function of time. Note that $F^{(n)}(1) = 1$ and let

$$M_n = \frac{d}{dx}F^{(n)}(x)\Big|_{x=1}.$$

Then

$$M_n = G_1'(1)\left[F^{(n-1)}(1) + M_{n-1}\right] = \frac{z_2}{z_1}\left(1 + M_{n-1}\right). \qquad (13)$$

Using $M_0 = 0$, one can solve this iterated map exactly as a power series, which has the compact final expression

$$M_n = \sum_{l=0}^{n-1}\left[G_1'(1)\right]^{l+1} = \frac{z_2}{z_1}\,\frac{1 - (z_2/z_1)^n}{1 - z_2/z_1}. \qquad (14)$$

Then the average number of individuals in the cluster is

$$\langle k\rangle_n = \frac{d}{dx}H^{(n)}(x)\Big|_{x=1} = 1 + z_1\left(1 + \frac{z_2}{z_1}\,\frac{1 - (z_2/z_1)^{n-1}}{1 - z_2/z_1}\right). \qquad (15)$$

If $z_2 < z_1$, one obtains the expected result [8, 14, 15] for large $n$ that

$$\langle k\rangle = 1 + z_1\left(1 + \frac{z_2}{z_1 - z_2}\right) = 1 + \frac{z_1^2}{z_1 - z_2}. \qquad (16)$$

In the opposite case, $z_2 > z_1$, Eq. (14) becomes $M_n \approx (z_2/z_1)^{n+1}/(z_2/z_1 - 1)$ and for large $n$ the average size of the infected population is

$$\langle k\rangle_n \approx \frac{z_1\,(z_2/z_1)^n}{z_2/z_1 - 1}. \qquad (17)$$
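The recursion for $M_n$ and the resulting mean cluster size can be checked against the closed form with a few lines of code. The function names are illustrative assumptions. The closed form used below, $1 + z_1 (1 - r^n)/(1 - r)$ with $r = z_2/z_1$, is algebraically the same as Eq. (15), and it assumes $z_1 \neq z_2$.

```python
def mean_cluster_size(z1, z2, n):
    """<k>_n for n >= 1 from the recursion M_m = (z2/z1)(1 + M_{m-1}), M_0 = 0."""
    r = z2 / z1
    M = 0.0
    for _ in range(n - 1):      # after the loop, M holds M_{n-1}
        M = r * (1.0 + M)
    return 1.0 + z1 * (1.0 + M)


def mean_cluster_size_closed(z1, z2, n):
    """Closed form 1 + z1 (1 - r^n)/(1 - r), valid for z1 != z2."""
    r = z2 / z1
    return 1.0 + z1 * (1.0 - r**n) / (1.0 - r)


# Contained case from the text: z1 = 1.46, z2 = 1.38.
print(mean_cluster_size(1.46, 1.38, 100))
print(1.0 + 1.46**2 / (1.46 - 1.38))    # large-n limit, Eq. (16)

# Epidemic case from the text: z1 = 1.79, z2 = 3.42.
print(mean_cluster_size(1.79, 3.42, 12))
print(mean_cluster_size_closed(1.79, 3.42, 12))
```

With $n = 12$ in $H^{(n)}$, which corresponds to 11 iterations of the map for $F$, the epidemic case reproduces the mean of roughly 4650 quoted earlier.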



The width of the distribution is proportional to the mean. The dominant contribution to $\langle k^2\rangle_n$ at large $n$ is

$$\left[\frac{(z_2/z_1)^n}{z_2/z_1 - 1}\right]^2\left[z_2 + \frac{z_1^2\, G_1''(1)}{z_2\,(z_2/z_1 - 1)}\right]. \qquad (18)$$

When infection is not certain across an edge. Newman [8] describes the case where infection is not certain across an edge connecting two nodes, but occurs with probability $T$. In this case, the probability of infecting neighbors starting with a randomly chosen node is generated by

$$G_0(1 + T(x-1)), \qquad (19)$$

the probability of infecting neighbors starting with a randomly chosen edge, excluding the incoming edge, is generated by

$$G_1(1 + T(x-1)), \qquad (20)$$

and employing these two generating functions, the evolution equations (7c) are unchanged, and Eq. (15) for the average degree of infection generalizes to

$$\langle k\rangle_{n+1} = \frac{d}{dx}H^{(n+1)}(x)\Big|_{x=1} = 1 + \frac{z_1^2 T - z_1^2 T\,(z_2 T/z_1)^{n+1}}{z_1 - z_2 T}. \qquad (21)$$

Essentially $z_2$ is replaced by $Tz_2$.
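The effect of a transmission probability $T$ on the mean cluster size can be sketched by repeating the recursion for the mean with the substituted generating functions. Their derivatives at $x = 1$ are $T z_1$ and $T z_2/z_1$. The function names below are illustrative assumptions, and the closed form assumes $z_1 \neq z_2 T$.

```python
def mean_with_transmissibility(z1, z2, T, n):
    """<k>_n for n >= 1 when each edge transmits infection with probability T."""
    first = T * z1           # derivative of G0(1 + T(x - 1)) at x = 1
    growth = T * z2 / z1     # derivative of G1(1 + T(x - 1)) at x = 1
    M = 0.0
    for _ in range(n - 1):
        M = growth * (1.0 + M)
    return 1.0 + first * (1.0 + M)


def mean_with_transmissibility_closed(z1, z2, T, n):
    """Closed form 1 + z1^2 T (1 - (z2 T / z1)^n) / (z1 - z2 T)."""
    return 1.0 + z1**2 * T * (1.0 - (z2 * T / z1) ** n) / (z1 - z2 * T)


z1, z2 = 1.79, 3.42          # second polynomial example from the text
for T in (1.0, 0.6, 0.4):
    print("T =", T,
          " growth factor =", T * z2 / z1,
          " <k>_12 =", mean_with_transmissibility(z1, z2, T, 12))
```

In this sketch the mean grows without bound only when the growth factor $T z_2 / z_1$ exceeds 1, so for these values of $z_1$ and $z_2$ a transmission probability of 0.4 keeps the mean finite while 0.6 does not.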

Individuals added at each time step. Another interesting quantity to track is the probability of adding $k$ individuals at each time step. This can be done by adding a subscript to the variable $x$ that tracks the time step at which an individual has entered the cluster. Doing so one has

$$H^{(0)}(x_0) = x_0,$$
$$H^{(1)}(x_0, x_1) = x_0 G_0(x_1), \qquad (22)$$
$$H^{(2)}(x_0, x_1, x_2) = x_0 G_0(x_1 G_1(x_2)). \qquad (23)$$

Continuing in this fashion, one has

$$H^{(n)}(\vec{x}) = H^{(n-1)}(x_0, x_1, \ldots, x_{n-1} G_1(x_n)). \qquad (24)$$

One recovers the results in Eq. (7c) by removing all the indices from the variables $x$. To focus upon the individuals added to the cluster at step $n$, just set all variables $x_i$ to 1 except the last. Let $J^{(n)}(x)$ be the generating function for the probability that $k$ individuals have been added at time step $n$. Then

$$J^{(1)}(x) = G_0(x); \qquad J^{(n)}(x) = J^{(n-1)}(G_1(x)).$$

One can now calculate the mean number of people added at each time step,

$$\delta k_1 = z_1, \qquad \delta k_2 = \frac{z_2}{z_1}\, z_1, \qquad \ldots, \qquad \delta k_n = \left(\frac{z_2}{z_1}\right)^{n-1} z_1.$$

Figure 3: This plot of $\langle k\rangle P_k^{(n)}$, using the generating functions from Figure 1(B) iterated 10 times, does appear to be converging on a scaling form. Convergence for small values of $\kappa = k/\langle k\rangle_n$ is pointwise on a logarithmic scale (A) and uniform on a linear scale (B).
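The mean number of individuals added at step $n$ can be checked by differentiating the nested composition $J^{(n)}(x) = G_0(G_1(\cdots G_1(x)))$ at $x = 1$ with the chain rule. The sketch below carries a value and a derivative through each polynomial. The helper names are assumptions, and the degree distribution is the second polynomial example from the text.

```python
def eval_with_derivative(coeffs, x, dx):
    """Return (f(x), f'(x) * dx) for the polynomial f with the given coefficients."""
    value = 0.0
    slope = 0.0
    for c in reversed(coeffs):
        slope = slope * x + value   # Horner's rule for the derivative
        value = value * x + c
    return value, slope * dx


def mean_added(p, n):
    """Derivative at x = 1 of J^(n)(x) = G0(G1(...G1(x))), with n - 1 copies of G1."""
    d = [k * c for k, c in enumerate(p)][1:]
    z1 = sum(d)
    G1 = [c / z1 for c in d]
    x, dx = 1.0, 1.0
    for _ in range(n - 1):          # apply the innermost G1 first
        x, dx = eval_with_derivative(G1, x, dx)
    _, slope = eval_with_derivative(p, x, dx)
    return slope


p_B = [0.0, 0.7, 0.1, 0.05, 0.01, 0.14]
added = [mean_added(p_B, n) for n in range(1, 13)]
total_after_12 = 1.0 + sum(added)   # seed individual plus everyone added later

print(added[:3])
print(total_after_12)
```

Summing the per-step means and adding the initial individual gives the same mean cluster size as the closed-form expression for $\langle k\rangle_n$, which is a useful consistency check between the two calculations.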
Scaling form for epidemic. It would seem natural for the growing part of the probability distribution to adopt a scaling form at long times. To capture the growing part of the distribution, one computes

$$R_k^{(n)} = \frac{1}{\langle k\rangle_n}\,\mathcal{R}(\kappa) \quad\text{where}\quad \kappa = k/\langle k\rangle_n. \qquad (25)$$

As shown in Figure 3, this scaling form does appear to describe $R$ after sufficiently many iterations. On a log scale the tail of $R$ for small $\kappa = k/\langle k\rangle_n$ converges pointwise, but on a linear scale convergence is uniform.